Multi-Reference Video Coding Using Stillness Detection

Authors

  • Di Chen
  • Zoe Liu
  • Yaowu Xu
  • Fengqing Zhu
  • Edward Delp
Abstract

Encoders of the AOM/AV1 codec consider an input video sequence as a succession of frames grouped into Golden-Frame (GF) groups. The coding structure of a GF group is fixed for a given GF group size. In the current AOM/AV1 encoder, video frames are coded using a hierarchical, multilayer coding structure within one GF group. It has been observed that the use of a multilayer coding structure may result in worse coding performance if the GF group presents consistent stillness across its frames. This paper proposes a new approach that adaptively designs the GF group coding structure through the use of stillness detection. The new approach develops an automatic stillness detection scheme using three metrics extracted from each GF group. It then differentiates GF groups of stillness from other, non-still GF groups and uses different GF coding structures accordingly. Experimental results demonstrate a consistent coding gain using the new approach.

Introduction

The AOM/AV1 codec [1] is an open-source, royalty-free video codec developed by the Alliance for Open Media (AOM), a consortium of major technology companies co-founded by Google. It succeeds the VP9 codec [2, 3], a video codec designed specifically for media on the web by the Google WebM Project [4]. The AOM/AV1 codec introduces several new features and coding tools such as switchable loop restoration [5], global and locally warped motion compensation [6], and variable block-size overlapped block motion compensation [7]. AOM/AV1 is expected to achieve a generational improvement in coding efficiency over VP9.

The current AOM/AV1 codec divides the source video frames into Golden-Frame (GF) groups. The length of each GF group, i.e. the GF group interval, may vary according to the video's spatial or temporal characteristics and other encoder configurations, such as the key frame interval requested for the sake of random access or error resilience. The coding structure of each GF group is based on its interval length and the selection of reference frames buffered for the coding of other frames. The coding structure determines the encoding order of each individual frame within one GF group. In the current implementation of the AOM/AV1 encoder, a GF group may have a length between 4 and 16 frames. Various GF coding structures may be designed depending on the encoder's decision on the construction of the reference frame buffer, as shown in Figure 1a and Figure 1b. The extra-ALTREF_FRAMEs and the BWDREF_FRAMEs introduce a hierarchical coding structure to the GF groups [8].

The VP9 codec uses three references for motion compensation, namely LAST_FRAME, GOLDEN_FRAME and ALTREF_FRAME. GOLDEN_FRAME is the intra prediction frame. LAST_FRAME is the forward reference frame. ALTREF_FRAME is the backward reference frame, selected from a distant future frame; it is the last frame of each GF group. A new coding tool adopted by AV1 extends the number of reference frames by adding LAST2_FRAME, LAST3_FRAME, extra-ALTREF_FRAME and BWDREF_FRAME. LAST2_FRAME and LAST3_FRAME are similar to LAST_FRAME. extra-ALTREF_FRAME and BWDREF_FRAME are backward reference frames at a relatively shorter distance; the main difference between them is that BWDREF_FRAME does not apply temporal filtering. The hierarchical coding structure in Figure 1a may greatly improve the coding efficiency due to its multilayer, multi-backward-reference design. The current AOM/AV1 encoder uses the coding structure shown in Figure 1a for all the GF groups.
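To make the difference between the two structures concrete, the sketch below (Python, not taken from the AOM/AV1 code base) generates a possible encoding order for a GF group under a flat one-layer structure and under a simplified hierarchical multilayer structure. AV1's actual reference-buffer update rules and layer assignments are more involved than this illustration; the midpoint-splitting recursion is only an assumed stand-in for the BWDREF_FRAME/extra-ALTREF_FRAME hierarchy of Figure 1a.

```python
def one_layer_order(gf_len):
    """Flat structure (Figure 1b, simplified): the ALTREF_FRAME (last frame
    of the GF group) is coded first as the distant backward reference, then
    the remaining frames are coded in display order."""
    return [gf_len - 1] + list(range(gf_len - 1))


def multilayer_order(gf_len):
    """Hierarchical structure (Figure 1a, simplified): after the ALTREF_FRAME,
    the midpoint of each remaining interval is coded before the frames that
    can use it as a shorter-distance backward reference."""
    def order(lo, hi):
        if lo > hi:
            return []
        mid = (lo + hi) // 2
        return [mid] + order(lo, mid - 1) + order(mid + 1, hi)

    return [gf_len - 1] + order(0, gf_len - 2)


print(one_layer_order(8))   # [7, 0, 1, 2, 3, 4, 5, 6]
print(multilayer_order(8))  # [7, 3, 1, 0, 2, 5, 4, 6]
```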
However, a comparison of the compression performance with extra-ALTREF_FRAME and BWDREF_FRAME enabled and disabled showed that the coding efficiency for some test videos was actually worse when these two reference frames were enabled. This means that the multilayer coding structure does not always provide better coding efficiency for all GF groups. One such example is GF groups with the stillness feature. In this paper, we propose a new approach that adaptively designs the Golden-Frame (GF) group coding structure through the use of stillness detection. A set of metrics is designed to determine whether the frames in a GF group exhibit little motion.

Little work has been done that investigates the use of different coding structures depending on video content. In [9], an adaptive video coding control scheme is proposed that suggests using more P- and B-frames when the temporal correlation among the frames in a group of pictures (GOP) is high. A method for using different GOP sizes based on video content is presented in [10].

Method

GF Group Stillness

A GF group may be constructed to contain consistent characteristics that differentiate it from other GF groups. For instance, some GF groups may present stillness across their successive frames, while others may present a zoom-in / zoom-out motion across the entire GF group. We examined the coding efficiency and the stillness feature of each GF group and found that when stillness is present in a GF group, the use of the multilayer coding structure shown in Figure 1a may produce worse coding performance than that produced by the one-layer structure in Figure 1b.

Automatic GF Group Stillness Detection

An automatic stillness detection of the GF groups is proposed in this paper, which allows each GF group to choose adaptively between the two coding structures shown in Figure 1a and Figure 1b.

Figure 1. GF Group Coding Structures: (a) using a multilayer structure; (b) using a one-layer structure.

Three metrics are extracted from the GF group during the first coding pass of AOM/AV1 to determine GF group stillness. The first coding pass of AOM/AV1 conducts fast block matching with integer-pixel accuracy and uses only one reference frame, the previous frame. Motion vector and motion compensation information is collected during this pass. Our proposed stillness detection method uses this information to extract the three metrics described below, which requires only a small amount of computation. It then identifies the thresholds and derives the criteria to classify GF groups into two categories: GF groups of stillness and GF groups of non-stillness. The thresholds are obtained by collecting statistics of the three metrics from the GF groups of eight low-resolution (CIF) test videos, in which we manually labeled each GF group as still or non-still. Figure 2 shows the histograms and the thresholds of the three metrics. We intentionally included some test videos that contain GF groups with "stillness-like" characteristics in the non-stillness class because they are more likely to be misclassified as GF groups of stillness. A GF group with "stillness-like" characteristics shows either very slow motion or a static background with small moving objects. We obtained three criteria which are jointly applied to automatically detect stillness. Finally, the GF group is coded using the workflow given in Figure 3; a simplified sketch of this decision step is given below.
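As a rough illustration of this decision workflow (Figure 3 itself is not reproduced here), the sketch below classifies a GF group from its three first-pass metrics and selects the coding structure accordingly. Only the first metric, zero_motion_accumulator, is named in the excerpt; the other two metric names and all threshold values in this sketch are placeholders, not the thresholds learned from the eight CIF training videos.

```python
from dataclasses import dataclass


@dataclass
class GFGroupMetrics:
    """Per-GF-group statistics gathered from AOM/AV1 first-pass data."""
    zero_motion_accumulator: float  # defined in the metric list below
    metric_2: float                 # placeholder for the paper's second metric
    metric_3: float                 # placeholder for the paper's third metric


def is_still(m, thr_zero_motion=0.9, thr_2=0.1, thr_3=0.1):
    # The three criteria are applied jointly: all of them must hold for the
    # GF group to be classified as still. Threshold values are hypothetical.
    return (m.zero_motion_accumulator >= thr_zero_motion
            and m.metric_2 <= thr_2
            and m.metric_3 <= thr_3)


def choose_gf_structure(m):
    # Still GF groups are coded with the flat one-layer structure (Figure 1b);
    # all other GF groups keep the hierarchical multilayer structure (Figure 1a).
    return "one_layer" if is_still(m) else "multilayer"
```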
Stillness Detection Metrics:

1. zero_motion_accumulator: the minimum of the per-frame percentage of zero-motion inter blocks within one GF group:

   $\mathrm{zero\_motion\_accumulator} = \min\left(\mathrm{pcnt\_zero\_motion}_{F_i} \mid F_i \in S\right)$

   where $S$ denotes the set of frames in the GF group.
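A minimal sketch of how this metric could be computed from first-pass data is shown below: the fraction of zero-motion inter blocks is taken per frame and the minimum over the GF group is kept. The block-level interface is assumed for illustration; in the encoder this statistic corresponds to the accumulated first-pass counter pcnt_zero_motion rather than an explicit list of motion vectors.

```python
def frame_pcnt_zero_motion(motion_vectors):
    # Fraction of inter blocks in one frame whose integer-pel motion vector
    # is (0, 0); this stands in for the per-frame first-pass statistic
    # pcnt_zero_motion used in the equation above.
    if not motion_vectors:
        return 0.0
    zero = sum(1 for mvx, mvy in motion_vectors if mvx == 0 and mvy == 0)
    return zero / len(motion_vectors)


def zero_motion_accumulator(per_frame_motion_vectors):
    # Minimum over all frames F_i in the GF group S of pcnt_zero_motion_{F_i}.
    return min(frame_pcnt_zero_motion(mvs) for mvs in per_frame_motion_vectors)
```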

Publication date: 2018